Abstract. We consider the directed Hausdorff distance between point sets in the
plane, where one or both point sets consist of imprecise points. An imprecise point is
modelled by a disc given by its centre and a radius. The actual position of an imprecise
point may be anywhere within its disc. Depending on the direction of the Hausdorff distance
and on whether its tight upper or lower bound is computed, there are several cases to
consider. For every case we either show that the computation is NP-hard or present
an algorithm with polynomial running time. Further, we give several approximation
algorithms for the hard cases and show that one of them cannot be approximated
better than with factor 3, unless P=NP.

1 Introduction 

The analysis and comparison of geometric shapes are essential tasks in various
application areas within computer science, such as pattern recognition and
computer vision. Beyond these fields, other disciplines such as cartography,
molecular biology, medicine, and biometric signal processing also evaluate the
shape of objects. In many cases patterns and shapes are modelled as finite sets of
points.

The Hausdorff distance is an important tool to measure the similarity between
two sets of points (or, more generally, any two subsets of a metric space). It is
defined as the largest distance from any point in one of the sets to the closest
point in the other set (see Section 1.3 for a formal definition). This distance is
used extensively in pattern matching.

Data imprecision is a phenomenon that has existed for as long as data has been
collected. In practice, data is often sensed from the real world, and as a result has a
certain error region. On the one hand, many application fields of computational
geometry use algorithms that take this into account. However, these algorithms
are mostly heuristics, and do not benefit from theoretical guarantees. On the
other hand, algorithms from computational geometry are provably correct and
efficient, but often under the assumption that the input data is correct. If we want
these algorithms to be used in practice, they need to take imprecision into
account in the analysis. Thus, not surprisingly, data imprecision in computational
geometry is receiving more and more attention.

In this paper, we study several variants of the important and elementary
problem of computing the Hausdorff distance under the Euclidean metric between
imprecise point sets.

1.1 Related Work 

The Hausdorff distance is one of the most studied similarity measures. For a
survey about similarity measures and shape matching refer to [2]. A
straightforward, naive algorithm computes the Hausdorff distance between two point
sets A and B consisting of m and n points, respectively, in $O(mn)$ time. Using
Voronoi diagrams and a more sophisticated approach the running time can be
reduced to $O((m + n) \log n)$ [1].
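
As an illustration of the naive approach mentioned above, the following minimal Python sketch evaluates the directed distance by checking all pairs. The function name and the representation of points as coordinate tuples are assumptions made for this example only; it is not the Voronoi-based method of [1].

```python
import math


def directed_hausdorff(P, Q):
    """Naive O(mn) directed Hausdorff distance h(P, Q) for lists of 2D points."""
    if not P or not Q:
        raise ValueError("both point sets must be non-empty")
    # For each p take the distance to its nearest q, then take the largest of these.
    return max(min(math.dist(p, q) for q in Q) for p in P)
```

Note that the measure is not symmetric: swapping the arguments generally gives a different value.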

The study of imprecision within computational geometry started around twenty
years ago, when Guibas et al. [6] introduced epsilon geometry as a way to handle
computational imprecision. In this model, each point is assumed to be at most
$\varepsilon$ away from its given location.

For a given measure on a set of imprecise points, one of the simplest questions
to ask in this model is: what are the possible output values? Each input point
can be anywhere in a given region, and depending on where each point is, the
output will have a different value. This leads to the problem of placing the points
in their regions such that this value is minimised or maximised. One of the first
results of this kind is due to Goodrich and Snoeyink [5], who show how to place a
set of points on a set of vertical line segments such that the points are in convex
position and the area or perimeter of the convex hull is minimised in $O(n^2)$ time.
A similar problem is studied by Mukhopadhyay et al. [10], and their result was
later generalised to isothetic line segments [9].

Nagai and Tokura [11] thoroughly study the efficient computation of lower and
upper bounds for a variety of region shapes and measures; in particular they
study the diameter, the width, and the convex hull, and all their algorithms
run in $O(n \log n)$ time. However, not all of their bounds are tight. Van Kreveld
and Löffler [12] study the same problems and give algorithms to compute tight
bounds, though the running times of the algorithms can be much higher and
some variants are proven to be NP-hard.

1.2 Contribution 

In this paper, we assume that an imprecise point is modelled by a disc with a 
given centre and radius. In general, it is possible that the discs intersect. We 
assume we have two sets of points, P and Q, and that at least one of them is 
imprecise. We want to compute the directed Hausdorff distance from P to Q. 
This includes both the tight lower and upper bound on the possible values, for 
each combination. This leads to six different cases. Additionally, in some settings 
the problems become easier if we restrict the model of imprecision to disjoint 
discs or discs that all have the same radius; we state these results separately. 
Our results are summarised in Table 1.

setting                                              | tight lower bound                        | tight upper bound
-----------------------------------------------------|------------------------------------------|------------------
$h(\tilde P, Q)$ [general]                           | $O(n^2)$                                 | $O(n \log n)$
$h(P, \tilde Q)$ [general]                           | NP-hard*, 4-APX in $O(n^3 \log^2 n)$     | $O(n \log n)$
$h(P, \tilde Q)$ [disjoint unit discs]               | 3-APX-hard, 3-APX in $O(n^{10} \log n)$  | $O(n \log n)$
$h(\tilde P, \tilde Q)$ [general]                    | NP-hard                                  | $O(n^2)$
$h(\tilde P, \tilde Q)$ [const. depth in $\tilde P$] |                                          | $O(n \log n)$

Table 1. $P$ and $Q$ are precise point sets and $\tilde P$ and $\tilde Q$ are imprecise point sets.
Results are shown for the case when all sets have $O(n)$ elements. *Can be computed exactly in
$O(n^3)$ if the discs are disjoint and the answer is smaller than $r(\sqrt{5 - 2\sqrt{3}} - 1)/2$, where
$r$ is the radius of the smallest disc in $\tilde Q$.

In the next section, we review some definitions and structures that we use to 
obtain our results. After that, we present our three main results. In Section 2, 
we give a general algorithm for computing the upper bound, which works in all 
settings in the table, though it can be simplified (conceptually) in some settings. 
In Section 3, we prove hardness of computing the lower bound in most settings. 
Finally, in Section 4, we give algorithmic results for computing the lower bound, 
exactly in some cases and approximately in others. Due to space constraints 
some proofs and details can be found in the appendix. 



1.3 Preliminaries 

The directed Hausdorff distance $h$ from a point set $P = \{p_1, \dots, p_m\}$ to a point
set $Q = \{q_1, \dots, q_n\}$ with an underlying Euclidean metric can be computed in
$O((n + m) \log n)$ time, see [1], and is defined as (see Fig. 1 for an example):

$$h(P, Q) = \max_{p \in P} \min_{q \in Q} \|p - q\|.$$

Fig. 1. (a) $h(P, Q)$ is defined by the pair of points indicated by the arrow. (b) An
example input of imprecise points.

Let $\tilde P$ and $\tilde Q$ denote two imprecise point sets consisting of $m$ and $n$
closed discs respectively. We call a set $P = \{p_1, \dots, p_m\}$ a precise realisation of
$\tilde P = \{\tilde p_1, \dots, \tilde p_m\}$ if $p_i \in \tilde p_i$ for all $i$. We also write
$P \Subset \tilde P$ in this case.

We define the directed Hausdorff distance between a precise and an imprecise or two
imprecise point sets as the interval of all possible outcomes for that distance:

$$h(P, \tilde Q) = \{h(P, Q) \mid Q \Subset \tilde Q\}, \qquad h(\tilde P, Q) = \{h(P, Q) \mid P \Subset \tilde P\},$$
$$h(\tilde P, \tilde Q) = \{h(P, Q) \mid P \Subset \tilde P,\ Q \Subset \tilde Q\}.$$

Further, we denote the tight upper and lower bounds of this interval by $h_{\max}$
and $h_{\min}$ respectively, for example

$$h_{\max}(P, \tilde Q) = \max h(P, \tilde Q) \quad\text{and hence}\quad h(P, \tilde Q) = [h_{\min}(P, \tilde Q), h_{\max}(P, \tilde Q)].$$

2 Algorithm for computing the tight upper bound

In this section, we consider the following problem. Given are two sets of discs
$\tilde P$ and $\tilde Q$. The radii may all be different; an example input is shown
in Fig. 1(b). We want to place point sets $P \Subset \tilde P$ and $Q \Subset \tilde Q$ so as
to maximise the directed Hausdorff distance $h(P, Q)$. In other words, we want to place
the points in $P$ and $Q$ such that one point from $P$ is as far away as possible from all
points in $Q$. The placements of the remaining points of $P$ do not matter. So, we
need to identify which point $\hat p \in P$ will play this important role.

2.1 Basic algorithm 

We will first compute the inverted additive Voronoi Diagram (iaVD) of $\tilde Q$. This
is a subdivision of the plane into regions where each point $x$ in the plane is
associated with the disc in $\tilde Q$ whose furthest point is closest to $x$. See Fig. 2(a)
for an example. This diagram can be computed in $O(n \log n)$ time [4], since it
corresponds to the additively weighted Voronoi Diagram (also known as Apollonius
diagram) of the centres of $\tilde Q$, where the weight of a point is minus the
radius of the corresponding disc.
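
The cell membership defined above can be evaluated directly for a single query point, without building the diagram. The sketch below is only meant to make the definition concrete; it assumes discs are given as hypothetical `(cx, cy, r)` tuples and uses a linear scan rather than the $O(n \log n)$ construction of [4].

```python
import math


def furthest_point_distance(x, disc):
    """Distance from point x to the furthest point of disc = (cx, cy, r)."""
    cx, cy, r = disc
    # The furthest point of a disc lies on the far side of its centre.
    return math.dist(x, (cx, cy)) + r


def iavd_site(x, discs):
    """Index of the disc whose furthest point is closest to x (O(n) scan)."""
    return min(range(len(discs)),
               key=lambda i: furthest_point_distance(x, discs[i]))
```

The value `furthest_point_distance(x, discs[iavd_site(x, discs)])` is the distance from `x` to the nearest point of $Q$ once every disc of $\tilde Q$ has been realised as far away from `x` as possible.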

Using the iaVD, we can place each point $p \in \tilde p \in \tilde P$ at a locally optimal
position, as if it were the critical point $\hat p$. We identify three possible placement
types for $p$ that are locally optimal, as illustrated in Fig. 2.

1. A vertex of the iaVD.

2. An intersection point between a Voronoi edge and the boundary of a disc from $\tilde P$.

3. A point on the boundary of $\tilde p$ that is furthest away from the iaVD site whose
cell contains the centre of $\tilde p$.
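
The third placement type has a simple closed form: move from the centre of the disc straight away from the centre of the site by one radius. A minimal sketch, again assuming discs are hypothetical `(cx, cy, r)` tuples, is:

```python
import math


def type3_placement(p_disc, site_disc):
    """Boundary point of p_disc that is furthest from the centre of site_disc."""
    px, py, pr = p_disc
    sx, sy, _ = site_disc
    dx, dy = px - sx, py - sy
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        # Concentric discs: every boundary point is equally far, pick one.
        return (px + pr, py)
    # Step one radius from the centre of p_disc, directly away from the site.
    return (px + pr * dx / norm, py + pr * dy / norm)
```

Whether this point is actually a valid candidate still depends on it lying in the iaVD cell of that site, which this sketch does not check.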

We can now iterate over all discs in $\tilde P$ and their locally optimal placements,
and determine the critical point $\hat p$ by keeping track of the locally optimal placement
$p \in \tilde p$ such that the shortest distance between $p$ and (the furthest point on) any
disc in $\tilde Q$ is maximised. Once $\hat p$ is known, we place all points in $Q$ as far away
from $\hat p$ as possible, and all points in $P \setminus \{\hat p\}$ anywhere inside their discs.
The result is shown in Fig. 4(b). As it is possible that there are $O(mn)$ locally optimal
placements of the second type (namely: an intersection between a disc boundary and a Voronoi
edge), we conclude with the following theorem.

Theorem 1. Given two sets $\tilde P$ and $\tilde Q$ of imprecise points of size $m$ and $n$,
respectively, we can compute $h_{\max}(\tilde P, \tilde Q)$ and precise realisations
$P \Subset \tilde P$ and $Q \Subset \tilde Q$ with $h(P, Q) = h_{\max}(\tilde P, \tilde Q)$ in
$O(nm + n \log n)$ time.




Fig. 2. (a) The inverted additive Voronoi Diagram (iaVD) of $\tilde Q$. The point set $P$
is placed locally optimally. (b) The points in $Q$ are all placed as far away from $\hat p$
as possible.

2.2 Faster algorithms in special cases

In this section we show how the above result can be improved under certain assumptions.
To speed up the algorithm, we make some observations about the nature of locally optimal
placements.

Lemma 1. Let $\tilde p$ be a disc in $\tilde P$, and let $\tilde q_1$ and $\tilde q_2$ be two
discs in $\tilde Q$, such that the part of the bisector of $\tilde q_1$ and $\tilde q_2$ that is
in the iaVD slices through $\tilde p$ (that is, it is not connected to a vertex of the iaVD
inside $\tilde p$). Then the optimal placement of $p$ occurs on the same side of this bisector
where the centre of $\tilde p$ is, regardless of what the rest of the iaVD looks like.

Proof. Some notation: let $p_c$ be the centre of $\tilde p$, $q_{c1}$ the centre of $\tilde q_1$
and $q_{c2}$ the centre of $\tilde q_2$. Now let $f_1$ be the point on the boundary of $\tilde p$
that is furthest away from $q_{c1}$ (this would be the type 3 placement if $\tilde q_1$ was the
only player), and similarly let $f_2$ be the point furthest away from $q_{c2}$. Now, suppose
w.l.o.g. that $p_c$ is on the same side as $q_{c1}$, and suppose that the optimal placement $p$
is on the other side, that is, on the side of $q_{c2}$. Then we observe that $f_2$ must be on the
side of $q_{c1}$, because $q_{c2}$, $p_c$ and $f_2$ lie on a line. This means that along the
boundary of $\tilde p$, the intersection points with the bisector have a better value than any
other point on the side of $q_{c2}$, in particular, better than $p$, which is a contradiction.
(Note that if there are other cells of the iaVD involved, the value of $p$ could only be lower.)

This lemma basically says that if we want to place a certain point $p$ locally
optimally, we can start looking by walking from the centre of $\tilde p$ and never have
to cross edges of the iaVD that slice through $\tilde p$, as illustrated in Fig. 3. This
leads to the following conclusion.

Corollary 1. Let $\tilde p$ be a disc in $\tilde P$, and suppose that the iaVD has $t$ vertices
inside $\tilde p$. Then we can find the locally optimal placement for $p$ in $O(t)$ time.

This immediately implies that if the discs of $\tilde P$ do not overlap, we can simply
place all points $p$ independently in linear time.

Now, assume that the discs of $\tilde P$ are disjoint, or that the intersection depth is
at most some constant $c$. Then, clearly, each vertex of the iaVD can appear in
at most $c$ discs of $\tilde P$. So, if each disc $\tilde p_i$ contains $t_i$ vertices of the iaVD,
we have $\sum_i t_i = O(cn)$, and we can find all locally optimal placements in $O(n)$ time.

Theorem 2. Given two sets $\tilde P$ and $\tilde Q$ of imprecise points of size $m$ and $n$,
respectively, where the discs in $\tilde P$ have constant intersection depth, we can
compute $h_{\max}(\tilde P, \tilde Q)$ and precise realisations $P \Subset \tilde P$ and
$Q \Subset \tilde Q$ with $h(P, Q) = h_{\max}(\tilde P, \tilde Q)$ in $O((m + n) \log(m + n))$ time.

Fig. 3. (a) There could be a quadratic number of intersections between the edges
of the iaVD of $\tilde Q$ and the discs in $\tilde P$. (b) When the discs overlap, the union
of $\tilde P$ has fewer intersections with the iaVD.

The algorithm described in this section works in the most general setting. However, in
some more specific settings, the algorithm can be simplified. For example, when the discs
of $\tilde Q$ are unit discs, the iaVD is simply the normal Voronoi diagram. When $P$ is not
imprecise, there are of course only $m$ possible locations for $\hat p$, and we do not
need to look for all three placement types. This results in the running times as
indicated in Table 1.

3 Hardness results for tight lower bounds 

In this section, we consider a transformation from the known NP-complete problem
PLANAR 3-SAT [8] to the problem of computing $h_{\min}(P, \tilde Q)$ for a set $P$ of
points and a set $\tilde Q$ of discs with radius $r$. In the PLANAR 3-SAT problem, we are
given as input a 3-SAT formula $f$ with the additional property that the graph
$G(f)$ is planar, where $G(f)$ has a vertex for each variable and each clause in $f$,
and there is an edge between a variable vertex and a clause vertex if the variable
occurs in the clause. Given the boolean formula $f$ and a planar embedding of
$G(f)$, the transformation is as follows (see Fig. 5(a,b) for a general overview):

For each variable vertex $v$ in $G(f)$, we construct a cycle $C$ of alternating points
in $P$ and discs in $\tilde Q$. The distance between consecutive points and discs is
$\varepsilon$, where $r = 2.5\varepsilon$ (see Fig. 5(c)). There may be bends up to a certain angle,
and also other geometric features necessary to connect cycles and chains. When looking
only at the points $P_C$ and discs $\tilde Q_C$ corresponding to a cycle $C$, we observe that
by the construction of $C$, there are two realisations $Q_C^0, Q_C^1 \Subset \tilde Q_C$ such that
$h(P_C, Q_C^0) = \varepsilon$ and $h(P_C, Q_C^1) = \varepsilon$. These two realisations represent the
two possible boolean values the variable for that cycle can have.

For each edge $\{v, c\}$ in $G(f)$, we construct a chain of alternating points in $P$
and discs in $\tilde Q$ with distance $\varepsilon$ (see Fig. 5(d)). The chain connects the cycle
corresponding to the variable $v$ and the representation of a clause $c$. One end of
this chain is a disc that will be part of a representation of clause $c$ (see Fig. 5(e)),
the other end is a point $p$ that is placed near a disc $\tilde q \in \tilde Q$ of a variable cycle
such that $p$ has distance $\varepsilon$ to either $Q_C^0 \cap \tilde q$ or $Q_C^1 \cap \tilde q$
(see Fig. 5(e)).

Each clause vertex in $G(f)$ is represented by three discs and one additional point
$p^*$, such that the disc centres lie on the vertices of an equilateral triangle, and
the point has distance $\varepsilon$ to each of the discs. The three discs are ends of chains
that connect to cycles that correspond to the three literals in the clause.

Theorem 3. Let $P$ be a precise point set and $\tilde Q$ be an imprecise point set of
pairwise disjoint discs. It is NP-hard to compute a $\delta$-approximation of the
directed Hausdorff distance $h_{\min}(P, \tilde Q)$ for $1 < \delta < 3$.




Fig. 4. (a) An example input. (b) The optimal output, shown as a set of circles
covering $Q$.

Fig. 5. (a) Planar embedding of $G(f)$; circles represent variables and rectangles
represent clauses. (b) Rough overview of how $G(f)$ is transformed into $P$ and $\tilde Q$; some
details are misrepresented. The chain starts with $p$ followed by $\tilde q'$, all other points and
discs belong to the cycle. (c,d) Two realisations (representing opposite boolean values)
with Hausdorff distance $\varepsilon$ of chains, cycles and connections. (e) Connection of a chain
to a cycle.

4 Algorithms for tight lower bounds 



In this section we present algorithms for computing the minimum of $h(\tilde P, Q)$
and $h(P, \tilde Q)$. As we have seen in the previous section, the latter problem is NP-hard
and even hard to approximate in some settings. In the following we give
a 4-approximation for the general case, an optimal 3-approximation for disjoint
discs, and an exact algorithm for the case where the Hausdorff distance is small,
which is not NP-hard. Many results in this section rely on similar ideas. Therefore,
we will describe several (sub-)algorithms with different approximation factors
and running times depending on the value $d$ of the optimal solution. Afterwards,
we discuss how to apply them to obtain the results claimed in Table 1.



4.1 Algorithm PlaceTogether 




In this section, we describe an algorithm for the case where we have an
imprecise point set $\tilde P$ and a precise point set $Q$. We place all points in
$\tilde P$ as close to a point in $Q$ as possible. Fig. 6(a) shows an example. For
each pair $(\tilde p, q)$ with $\tilde p \in \tilde P$ and $q \in Q$ we could simply compute the
placement $p \in \tilde p$ closest to $q$, keep the best placement for each disc, and keep
track of the longest resulting distance over all discs. This takes $O(mn)$ time. However,
in practice it is probably better to compute the Voronoi diagram of $Q$ first, and locate
the discs of $\tilde P$ in it. In the worst case, each disc could still intersect linearly
many Voronoi cells (although the input needs to be contrived for this). Also, note that
as soon as a disc from $\tilde P$ is discovered to contain a point from $Q$, we can stop the
computation for that disc and just place the point there.



Fig. 6. (a) Placing points in $\tilde P$. (b) Discs of radius at most $c$ can only intersect
at most two discs of $\tilde Q$.

Theorem 4. Let $\tilde P$ denote an imprecise point set consisting of $m$ discs and $Q$
denote a precise point set consisting of $n$ points. The tight lower bound of $h(\tilde P, Q)$
can be computed in $O(mn)$ time.
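
The quadratic-time procedure behind Theorem 4 can be sketched in a few lines. The code below is an illustrative sketch only: the function name and the `(cx, cy, r)` disc representation are assumptions of this example, and it uses the plain all-pairs scan rather than the Voronoi-based speed-up suggested above.

```python
import math


def place_together(discs, Q):
    """Return (h_min, placements) for discs (cx, cy, r) and precise points Q."""
    h_min, placements = 0.0, []
    for cx, cy, r in discs:
        # The point of Q nearest to the centre is also nearest to the disc.
        q = min(Q, key=lambda pt: math.dist((cx, cy), pt))
        d = math.dist((cx, cy), q)
        if d <= r:
            placements.append(q)  # q lies inside the disc: place the point on q
        else:
            t = r / d  # move from the centre towards q until the boundary
            placements.append((cx + t * (q[0] - cx), cy + t * (q[1] - cy)))
            h_min = max(h_min, d - r)
    return h_min, placements
```

Each disc contributes the distance from its boundary to the nearest point of $Q$ (or zero if it contains one), and the bound is the largest such contribution.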

4.2 Subalgorithm Candidates 

In the case where $P$ is precise and $\tilde Q$ is imprecise, we start with a simple
subalgorithm Candidates to establish the following lemma. The algorithm proving
this lemma can be found in Appendix B. The result will be used later.

Lemma 2. Let $P$ denote a precise point set consisting of $m$ points and $\tilde Q$ denote
an imprecise point set consisting of $n$ discs. It is possible to reduce the possible
values of $h_{\min}(P, \tilde Q)$ to $O(m^3 + m^2 n)$ many candidates in $O(m^3 + m^2 n)$ time.

4.3 Algorithm IndependentSets 

This algorithm computes exactly the Hausdorff distance from a precise point set
$P$ to an imprecise point set $\tilde Q$ when the distance is small. This is an exception
to the general NP-hardness of that setting.

First we compute the set of possible candidates for $h_{\min}(P, \tilde Q)$ by Candidates
and discard all values greater than or equal to $c = r(\sqrt{5 - 2\sqrt{3}} - 1)/2$. Now we
perform a binary search on the remaining values in order to determine the smallest
value $d$ for which the predicate described below evaluates to true. If such a
candidate exists, the algorithm returns $d$ as the value of the bound. Otherwise
$h_{\min}(P, \tilde Q) \ge r(\sqrt{5 - 2\sqrt{3}} - 1)/2$.
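
The search step over the candidate values can be sketched generically as follows. This is a hedged illustration: `candidates` and `predicate` are placeholders supplied by the caller (the actual candidate generation and predicate are described in the appendices), and the sketch assumes the predicate is monotone, that is, once it holds for some value it holds for all larger ones.

```python
def smallest_feasible(candidates, predicate):
    """Smallest candidate d with predicate(d) true, or None if there is none.

    Assumes predicate is monotone: false up to some value, true afterwards.
    """
    values = sorted(set(candidates))
    lo, hi = 0, len(values)  # the answer index lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if predicate(values[mid]):
            hi = mid  # feasible: try smaller values
        else:
            lo = mid + 1  # infeasible: a larger value is needed
    return values[lo] if lo < len(values) else None
```

With $k$ candidates this evaluates the predicate $O(\log k)$ times, which is where the logarithmic factors in the running times come from.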

Let $p(d)$ denote the disc of radius $d$ around a point $p \in P$. There must be at
least one point of $Q$ in $p(d)$ to which $p$ can be matched within a distance of at
most $d$. The computation of the predicate relies on two observations.
The first is that all considered values are so small that no $p(d)$ intersects more than
two discs of $\tilde Q$. Note that a disc that intersects more than two disjoint discs of
$\tilde Q$ has a radius of at least $r(2/\sqrt{3} - 1)$, which is greater than $c$, see Fig. 6(b).
Thus, there are at most two possible matching partners for each point $p \in P$.
The second observation is that each $p(d)$ has to intersect at least one disc of $\tilde Q$,
otherwise the Hausdorff distance would be greater than $d$ at $p$.
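
The comparison between the two constants used in the first observation is easy to verify numerically. The helper names below are hypothetical and chosen for this check only.

```python
import math


def independent_sets_threshold(r):
    """c = r * (sqrt(5 - 2*sqrt(3)) - 1) / 2, the cut-off for candidate values."""
    return r * (math.sqrt(5 - 2 * math.sqrt(3)) - 1) / 2


def three_disc_radius(r):
    """r * (2/sqrt(3) - 1): radius of the smallest disc that touches three
    mutually tangent discs of radius r."""
    return r * (2 / math.sqrt(3) - 1)
```

For $r = 1$ the threshold is roughly 0.120 and the three-disc radius roughly 0.155, so a disc of radius below $c$ indeed cannot reach three disjoint discs of radius $r$.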
We define a point $p \in P$ to have degree 1 if $p(d)$ intersects just one disc
$\tilde q \in \tilde Q$ and to have degree 2 if it intersects two discs of $\tilde Q$.

The predicate tests whether $h_{\min}(P, \tilde Q) \le d$. To this end we associate with each
$\tilde q_i \in \tilde Q$ a feasible region $F_i$ and a set $C_i$ called children of $q_i$. The
feasible region contains the valid placements of $q_i \in \tilde q_i$. The children of
a point $q_i$ are the points of $P$ that can only be matched to $q_i$ because otherwise
$h_{\min}(P, \tilde Q)$ would be greater than $d$. In other words, the children of $q_i$ demand
that $q_i$ is placed in its feasible region $F_i$, see Fig. 7(a).

Fig. 7. A point $q \in \tilde q$ may only be placed within its feasible region (green).
(a) The two left points of $P$ have degree 1, the right point has degree 2. (b,c) The points
have degree two. Both cases show a scenario which allows the points to be matched locally,
i.e. only considering the set $D$ and its two corresponding feasible regions.

We restrict the feasible regions and children in an iterative manner with the
help of the sub-algorithms Remove-degree-1-discs and Remove-degree-2-discs,
which can be found in C.1. If a feasible region turns out to be empty,
the computation stops and the predicate returns false. The first sub-algorithm
considers only points $p$ of degree 1 and restricts the feasible regions of the discs
intersected by $p(d)$. The second sub-algorithm computes which $p(d)$ of degree 2
can also only be stabbed by one disc in $\tilde Q$ by considering all $p(d)$ which intersect
the same two feasible regions, see Fig. 7(b) and 7(c). Afterwards the only unmatched
points $p$ are those whose disc $p(d)$ can be stabbed in two valid feasible regions.
Furthermore, two discs $p(d)$ cannot intersect if they belong to two different pairs
of feasible regions, because we only consider distances $d < r(\sqrt{5 - 2\sqrt{3}} - 1)/2$.
Thus, it is possible to check for a valid point matching of the remaining points
by computing the maximum matching in a bipartite graph (see Section C.1 for
a detailed description). Finally, we make use of a maximum matching in order
to check whether all points can be matched within the distance $d$.

It is simple to return a matching which realises the Hausdorff distance which the
predicate proved to be realisable. To do so, we first consider the feasible regions
which are adjacent to a vertex in the maximum matching. We place the point
in such a feasible region such that it intersects all discs in the adjacent set $D$ of
discs. For all other $\tilde q_i \in \tilde Q$ we place their point $q_i$ somewhere within its feasible
region $F_i$.

Theorem 5. Let $P$ denote a precise point set consisting of $m$ points and $\tilde Q$
denote an imprecise point set consisting of $n$ disjoint discs. Algorithm
IndependentSets computes whether the tight lower bound for $h(P, \tilde Q)$ is smaller
than $r(\sqrt{5 - 2\sqrt{3}} - 1)/2$, where $r$ is the radius of the smallest disc in $\tilde Q$. If
this is the case, the exact value of $h_{\min}(P, \tilde Q)$ is computed. The running time is
$O(m^3 + m^2 n + n \log^2 n)$.

4.4 Algorithm GrownDiscs 

In this section, we present an approximation algorithm for precise $P$ and imprecise
$\tilde Q$. As a subroutine in this algorithm, we assume that we are given an algorithm
that computes a $c$-approximation to the geometric $k$-centre problem (see Section 1.3),
in time $T(k, n)$. We need this because when we have $k$ discs of $\tilde Q$ which
partially overlap, and there are $n$ points of $P$ in the overlap, computing a lower bound
on the Hausdorff distance for this subset is exactly solving the geometric $k$-centre
problem. Using this subroutine, we will show how to get a $(c + 2)$-approximation to our
problem in time $O(m^3 + m^2 n + n^2 T(k, m) \log(m + n))$. Fig. 8 shows an example of the
problem.

Fig. 8. (a) A set $P$ of precise points and a set $\tilde Q$ of imprecise points. (b) The
optimal output. A set of circles of radius $d$ is shown around the points in $Q$, which
cover the points in $P$.

We first compute the set of possible values of the Hausdorff distance using Algorithm
Candidates, followed by a binary search on the resulting candidate values
in order to determine the smallest value $d$ for which the predicate described
below evaluates to true. For any value $d$, the decision algorithm will return a solution
with distance at most $(c + 2)d$ if a solution of value $d$ exists. Assume that this is the
case. We grow the discs in $\tilde Q$ by $d$, and consider the resulting arrangement $A$ of
enlarged discs. Fig. 9(a) shows an example of this. We observe that all points
of $P$ need to be inside some cell of this arrangement, otherwise there exists no
solution of distance $d$. Now each cell of the arrangement contains a subset of
the points from $P$, which are covered by a number of circles of radius $d$, see
Fig. 9(b). Now we can compute an approximate solution independently in each
cell of $A$ by applying the approximation algorithm for geometric $k$-centre. This
provides us with a number of circles per cell of the arrangement. Each cell can
also be identified with a subset of the discs of $\tilde Q$ whose enlarged discs contain
this cell. To solve the problem, we need to find a matching between the discs
of $\tilde Q$ and the circles that cover $P$. In Appendix D we describe more details of
the decision algorithm, which runs in $O(n^2 T(k, m) + mn + m\sqrt{m})$ time. For
the total running time, we first spend $O(m^3 + m^2 n)$ time to execute Algorithm
Candidates and compute the possible values of $d$. So, the total time we spend
is $O(m^3 + m^2 n + (n^2 T(k, m) + mn + m\sqrt{m}) \log(m + n))$.

Fig. 9. (a) The arrangement $A$ formed by the expanded discs $Q'$. (b) Each cell of the
arrangement is determined by the indices of the discs it lies inside. (c,d) Four disjoint
discs of radius $r$ of $\tilde Q$ such that the arrangement of enlarged circles of radius
$\tfrac{3}{2} r$ has a common intersection, while with five discs, this is not possible anymore.



Theorem 6. Let $P$ denote a precise point set consisting of $m$ points and $\tilde Q$
denote an imprecise point set consisting of $n$ discs. Given a $c$-approximation to
the geometric $k$-covering problem that runs in $T(k, m)$ time, we can compute
a $(c + 2)$-approximation to the tight lower bound of $h(P, \tilde Q)$ in
$O(m^3 + m^2 n + (n^2 T(k, m) + mn + m\sqrt{m}) \log(m + n))$ time, where $k \le n$ is an
internal parameter of the optimal solution.

4.5 Putting the algorithms together

For the remainder of this section let $r_{\min}$ and $r_{\max}$ denote the radius of the
smallest and largest disc in $\tilde Q$. When $P$ is precise and $\tilde Q$ is imprecise, we note
that by Theorem 6 Algorithm GrownDiscs immediately presents a 4-approximation
for the case when the discs may have different radii and overlap, which we obtain
by plugging in a 2-approximation algorithm for geometric $k$-covering that runs
in $O(m \log k)$ time [3]. The running time of the entire algorithm then becomes
$O(m^3 + m^2 n + mn^2 \log(m + n) \log n)$ in the worst case.
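
To give a feeling for what such a 2-approximation subroutine looks like, the sketch below uses the well-known greedy farthest-first traversal (often attributed to Gonzalez). This is a swapped-in, simpler technique that runs in $O(mk)$ time; it is not the $O(m \log k)$ algorithm of [3], and the function name and input format are assumptions of this example.

```python
import math


def farthest_first_centres(points, k):
    """Greedy 2-approximation for k-centre: returns (centres, covering radius)."""
    centres = [points[0]]
    # dist[i] = distance from points[i] to its nearest chosen centre so far
    dist = [math.dist(p, centres[0]) for p in points]
    while len(centres) < min(k, len(points)):
        # The next centre is the point that is currently covered worst.
        i = max(range(len(points)), key=dist.__getitem__)
        centres.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centres, max(dist)
```

The returned radius is at most twice the optimal covering radius with $k$ circles, which is the guarantee $c = 2$ needed in Theorem 6.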

We can improve this algorithm by first testing whether $v < c \cdot r_{\min}$ using
Algorithm IndependentSets and Theorem 5, without increasing the asymptotic
running time. If it is, then we can actually compute the exact solution.

Furthermore, when the discs are disjoint and all have the same size, we can
improve this result to a 3-approximation by combining Algorithm GrownDiscs
and a trivial algorithm called CentrePoints which simply places every imprecise
point at the centre of its disc. First we test whether $v > r/2 = r_{\max}/2$, by
applying CentrePoints and checking whether the resulting Hausdorff distance
is larger than $\tfrac{3}{2} r$. If it is, we are done. Otherwise, note that each cell of $A$ is
a subset of the intersection of $k \le 4$ discs, because the discs of $\tilde Q$ are disjoint and
$v < r/2$, see Fig. 9(c) and 9(d). Therefore, by Theorem 6 we can obtain a
3-approximation from Algorithm GrownDiscs by plugging in an exact algorithm
to solve the geometric $k$-covering problem.
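
The CentrePoints step and the accompanying threshold check are simple enough to write down directly. The sketch below assumes discs are hypothetical `(cx, cy, r)` tuples that all share the same radius `r`; the function names are illustrative.

```python
import math


def centre_points(P, discs):
    """Place every imprecise point at its disc centre and return h(P, centres)."""
    centres = [(cx, cy) for cx, cy, _ in discs]
    return max(min(math.dist(p, c) for c in centres) for p in P)


def centre_points_exceeds(P, discs, r):
    """Check from the text: is the CentrePoints distance larger than 3r/2?"""
    return centre_points(P, discs) > 1.5 * r
```

Since moving a point from anywhere in its disc to the centre changes any distance by at most $r$, the CentrePoints value is never more than $r$ above the optimum.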

We can solve the geometric 4-covering problem exactly by computing the
arrangement of circles of radius $d$ around the points to be covered. The arrangement
has quadratic complexity. Then we need to find out whether there are four cells
that together lie in all circles. There are $O(m^8)$ such combinations to test, and
by keeping track of which discs are already taken care of each can be tested in
constant time. So, using this algorithm, we have a $1 + 2 = 3$-approximation to
the original problem for disjoint unit discs. The total running time now becomes
$O(n^2 m^8 \log(m + n))$.

Theorem 7. Let $P$ denote a precise point set consisting of $m$ points and $\tilde Q$
denote an imprecise point set consisting of $n$ disjoint discs of the same radius.
The tight lower bound for $h(P, \tilde Q)$ is 3-approximable in time
$O(m^3 + m^2 n + n \log^2 n)$.

5 Conclusions and Future Work 

We studied computing tight lower and upper bounds on the directed Hausdorff
distance between two point sets, when at least one of the sets is imprecise. We
gave efficient exact algorithms for computing the upper bound, proved that
computing the lower bound is NP-hard in most settings, and provided approximation
algorithms. Furthermore, we showed that in one special case, our approximation
algorithm is optimal. In other settings, a gap in the factor between the hardness
result and approximation still remains. When both sets are imprecise, we do not
have any constructive results for the lower bound.

All our results hold for the directed Hausdorff distance. An obvious next step
would be to extend them to the undirected Hausdorff distance. We can
immediately solve the upper bound problem in that case using our results, since it
is just the maximum of the two directed distances. However, computing lower
bounds seems to be more complicated, because there one needs to find a single
placement of both point sets that minimises the distance in both directions at
the same time.

Other directions of future work include looking at other underlying metrics than
the Euclidean metric, other similarity measures than the Hausdorff distance, or,
as is common in shape matching, allowing some transformation of the point sets.

Fig. 10. (a) The endings of three chains arranged to represent a clause. (b) Connection
of a chain to a cycle with geometric details; the chain starts with $p$ followed by
$\tilde q'$, all other points and discs belong to the cycle. (c) Two realisations with Hausdorff
distance $\varepsilon$ of the structure in the left subfigure. (d) Two chains connecting to the same
cycle where the chains tap opposite boolean values.

A Proof of Theorem 3 

Fig. 10 gives additional details necessary for the proof and the transformation. 

Proof (of Theorem 3). For a given instance $f$ of the PLANAR 3-SAT problem, let
$G(f)$ be the planar graph corresponding to $f$, embedded such that all variables
are on a line, and all clauses are on either side of them, see Fig. 5(a) ($G(f)$
can always be drawn in this way [8]). From this embedding, we compute (as
described in Section 3) a set $P$ of precise points, a set $\tilde Q$ of imprecise points, and
numbers $\varepsilon > 0$ and $r = 2.5\varepsilon$.

Claim (1). If / is satisfiable, then h m i n (P, Q) = e. 

Proof. We consider an assignment of boolean values to the variables in $f$ such
that $f$ is satisfied, and we need to show that there exists a realisation $Q$
of $\tilde{Q}$ such that $h(P, Q) \le \varepsilon$. (Note that, by construction,
there is no realisation $Q'$ of $\tilde{Q}$ such that $h(P, Q') < \varepsilon$.)
For each cycle $C$ of a variable, we choose either $Q_0$ or $Q_1$ as the
realisation of the imprecise point set $\tilde{Q}_C$, depending on whether the
variable is false or true. Discs on chains are realised in the following way: at
the ending of the chain that connects to a cycle $C$, we have a point $p$ near a
disc $\tilde{q} \in \tilde{Q}_C$, and $\tilde{q}$ is realised by a point $q$.
The next object along the chain is a disc $\tilde{q}'$ (see Fig. 10(b)). We
realise $\tilde{q}'$ in either of two ways, as depicted in Fig. 10(c), depending
on whether the distance between $p$ and $q$ is equal to $\varepsilon$ or greater
than $\varepsilon$. This corresponds to the variable being either true or false.
The boolean value of the corresponding literal is then propagated to the other
end of the chain, to a clause $c$. Since $f$ is satisfiable, there is at least
one literal in each clause that satisfies the clause. Hence, there is at least
one chain with a realisation such that the point $p^*$ has distance at most
$\varepsilon$ to a point of this realisation.

□ 

Claim (2). If $h_{\min}(P, \tilde{Q}) < 3\varepsilon$, then $f$ is satisfiable.
Proof (of Claim (2)). We consider a realisation $Q$ of $\tilde{Q}$ with
$h(P, Q) < 3\varepsilon$, and we need to construct a variable assignment that
satisfies $f$. We first observe that the only place where two points in $P$ can
be matched to the same point $q \in \tilde{q} \in \tilde{Q}$ is where a chain
connects to a cycle. (Otherwise, the distance between the two points in $P$ is
larger than $6\varepsilon$, and hence they cannot be matched to the same point
$q$.) In this case, one of the points in $P$ is the end point of a chain, the
other point in $P$ belongs to a cycle, and $q$ belongs to the same cycle (see
Fig. 10(b)). From this we make an observation about how the points along chains
and cycles are matched to discs along the same chains and cycles. Let us
consider the sequence $p_0, \tilde{q}_0, p_1, \tilde{q}_1, p_2, \tilde{q}_2,
\ldots$ of points and discs ordered along a fixed cycle $C$. Exactly one of the
following two things is true for all $i = 0, 1, 2, \ldots$ (modulo the length
of $C$):

– $p_i$ is matched to a point $q_i \in \tilde{q}_i$, i.e. $\|p_i, q_i\| < 3\varepsilon$; or

– $p_i$ is matched to a point $q_{i-1} \in \tilde{q}_{i-1}$, i.e. $\|p_i, q_{i-1}\| < 3\varepsilon$.

In other words, each point on $C$ is matched to the next disc on $C$ in
clockwise order, or each point on $C$ is matched to the next disc on $C$ in
counter-clockwise order, but there is no mix of these along $C$. From these two
possibilities for cycle $C$, we derive the boolean value of the variable
corresponding to $C$. This assignment is in accordance with the two realisations
$Q_0$ and $Q_1$ (as defined above), which represent false and true. What is left
to show is that this assignment satisfies $f$. To see this, we consider any
clause $c$ of $f$ and argue that $c$ is satisfied. From the construction, we
know that $c$ is represented by one point $p^* \in P$ and three discs being the
endings of three chains. There must be a point $q \in Q$ such that
$\|p^*, q\| < 3\varepsilon$, and $q$ must lie in one of the discs that represent
the clause $c$. This disc $\tilde{q}$ is the ending of a chain
$\tilde{q}_0, p_0, \tilde{q}_1, p_1, \tilde{q}_2, \ldots, p_j$. In a similar way
as above, we conclude for this chain that:

– $p^*$ is matched to a point $q_0 \in \tilde{q}_0$, i.e. $\|p^*, q_0\| < 3\varepsilon$; and

– $p_i$ is matched to a point $q_{i+1} \in \tilde{q}_{i+1}$, for $i = 0, \ldots, j-1$; and

– $p_j$ is matched to a point $q_j \in \tilde{q}_j$, for some disc $\tilde{q}_j$ on some cycle $C$.

The variable corresponding to $C$ has a boolean value, according to the
realisation of the discs along $C$. Depending on whether this variable occurs
negated or non-negated in the clause $c$, the chain
$\tilde{q}_0, p_0, \tilde{q}_1, p_1, \tilde{q}_2, \ldots, p_j$ is connected to
the cycle $C$ such that "the boolean value true is propagated along the chain".
Hence, by construction, the boolean value of the variable corresponding to $C$
satisfies the clause $c$.

□ 

We conclude the proof of the theorem by observing that the construction can be
done without any intersection between discs and/or points, and such that chains
and/or cycles are far enough apart from each other not to interfere. We also
note that the size of $P$ and $\tilde{Q}$ is polynomial in the size of $G(f)$,
which follows from our planar embedding of $G(f)$.

□ 
Fig. 11. There are only polynomially many candidates for the infimum of
$h(P, Q)$, which are determined by (a) one point of $P$, (b) two points of $P$,
or (c) three or more points of $P$.

B Subalgorithm Candidates 

Let $q \in \tilde{q} \in \tilde{Q}$ be a placement of an imprecise point in
$\tilde{Q}$ which realises the Hausdorff distance $d$. The distance $d$ can be
determined by $q$ together with one, two, or three points of $P$. If $d$ is
determined by one point of $P$, there are $O(mn)$ possibilities, see
Fig. 11(a). If $d$ is determined by two points $p_1, p_2 \in P$, the point $q$
lies on the bisector of the line segment $p_1 p_2$, see Fig. 11(b), for which
$O(nm^2)$ possibilities exist. Finally, if $d$ is determined by three (or more)
points, all these points lie on a circle whose centre is $q$. Thus there are
$O(m^3)$ possible locations, see Fig. 11(c). The algorithm simply computes and
returns all $O(m^3 + m^2 n)$ locations in $O(m^3 + m^2 n)$ time.
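As a rough illustration, cases (a) and (c) of the candidate enumeration might be sketched as below. The function names and the representation of a disc as a `(centre, radius)` pair are assumptions made for this sketch; case (b), which intersects bisectors with disc boundaries, is omitted, and a full implementation would presumably also keep only those circle centres that lie inside a disc of the imprecise set.

```python
import math
from itertools import combinations

def one_point_candidates(P, discs):
    """Case (a): distance from a point p to the nearest point of a disc."""
    return [max(0.0, math.dist(p, c) - r) for p in P for (c, r) in discs]

def circumcircle(a, b, c):
    """Centre and radius of the circle through a, b, c; None if collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    det = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(det) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / det
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / det
    return (ux, uy), math.dist((ux, uy), a)

def three_point_candidates(P):
    """Case (c): radii of all circles through three points of P."""
    radii = []
    for a, b, c in combinations(P, 3):
        circle = circumcircle(a, b, c)
        if circle is not None:
            radii.append(circle[1])
    return radii
```

Enumerating all triples gives the cubic term of the bound, and the point-disc pairs give the $mn$ term.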



C Additions to IndependentSets 

C.1 Subalgorithms

The following two sub-algorithms restrict the feasible region and children of
each disc $\tilde{q} \in \tilde{Q}$ in an iterative manner.

Before calling Remove-degree-1-discs we define a set $P_R$ of all points which
are not matched so far and set $P_R := P$.

Remove-degree-1-discs

 1  forall $\tilde{q}_i \in \tilde{Q}$ do
 2      set $F_i := \tilde{q}_i$
 3      set $C_i := \emptyset$
 4  while there is some $p \in P$ such that $p(d)$ intersects only one $F_i$ do
 5      set $F_i := F_i \cap p(d)$
 6      if $F_i = \emptyset$ then
 7          return false
 8      set $C_i := C_i \cup \{p\}$
 9  set $P_R := P_R \setminus \bigcup_i C_i$
10  Remove-degree-2-discs
Fig. 12. (a) The gray circle in the middle has the minimal radius
$r(2/\sqrt{3} - 1)$ that is necessary to intersect at least three discs of
$\tilde{Q}$, where $r$ denotes the radius of the blue discs. Because all
considered candidates for the minimal Hausdorff distance are smaller than the
radius of the gray disc, there are at most two possible matching partners for
each point in $P$. (b) The black circle has the maximal radius $c$ for which no
two circles intersecting two different pairs of discs in $\tilde{Q}$ can be
stabbed by only one point $q \in \tilde{q}$. Since it has to intersect two discs
in $\tilde{Q}$, its centre must lie within the red lune of these discs grown by
$c$. Furthermore, its boundary must not intersect the green segment denoted by
$r$, because otherwise it could intersect another black circle intersecting the
upper two discs of $\tilde{Q}$. Using the cosine formula, it holds that
$(r + 2c)^2 < r^2 + (2r)^2 - 2 \cdot r \cdot 2r \cos\frac{\pi}{6}$. Solving the
latter inequality for $c$ yields $c < r(\sqrt{5 - 2\sqrt{3}} - 1)/2$.

In line 9 the remaining points $P_R = P \setminus \bigcup_i C_i$ are points
whose disc $p(d)$ intersects exactly two feasible regions. It is still possible
to match points in $P$ to points in $Q$ by only analysing their local
environment. This is done by Remove-degree-2-discs.

Remove-degree-2-discs

 1  foreach pair of feasible regions $(F_i, F_j)$, $i \ne j$ do
 2      compute the set $D$ of discs $p(d)$ intersecting both $F_i$ and $F_j$
 3      if
 4          $D$ can be stabbed by one point of either $F_i$ or $F_j$ or
 5          $D$ needs one point of $F_i$ and one point of $F_j$ to be stabbed
 6      then
 7          restrict $F_i$ and $F_j$ accordingly
 8          if $F_i = \emptyset \lor F_j = \emptyset$ then
 9              return false
10          Remove-degree-1-discs
11  BuildGraph

Note that the sets $D$ of line 2 partition the set of the discs $p(d)$ of the
points in $P_R$. Line 7 restricts the matching for points of degree 2 whose
matching does not interfere with the matching of points with other pairs of
feasible regions. See Fig. 7(b) and 7(c) for an illustration of the two possible
scenarios allowing a local matching.

In line 11 all discs of each subset $D$ can be stabbed by only one point of the
two feasible regions $F_i$ and $F_j$ they intersect. Further, no two discs
$p(d)$ of different sets $D$ can be stabbed by only one point, because
$d < r(\sqrt{5 - 2\sqrt{3}} - 1)/2$, see Fig. 12(b).
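The two constants of Fig. 12 can be checked numerically. The sketch below assumes that the angle in the cosine formula is $\pi/6$, which is the value consistent with the closed form $r(\sqrt{5 - 2\sqrt{3}} - 1)/2$; the function names are hypothetical.

```python
import math

def three_disc_gap_radius(r):
    """Radius of the circle in the gap between three mutually touching
    discs of radius r: circumradius 2r/sqrt(3) of the centres, minus r."""
    return r * (2.0 / math.sqrt(3.0) - 1.0)

def stabbing_bound(r):
    """Largest c with (r + 2c)^2 = r^2 + (2r)^2 - 2*r*(2r)*cos(pi/6).
    The angle pi/6 is an assumption reconstructed from the closed form."""
    rhs = r * r + (2.0 * r) ** 2 - 2.0 * r * (2.0 * r) * math.cos(math.pi / 6.0)
    return (math.sqrt(rhs) - r) / 2.0

def stabbing_bound_closed_form(r):
    """The closed form r * (sqrt(5 - 2*sqrt(3)) - 1) / 2."""
    return r * (math.sqrt(5.0 - 2.0 * math.sqrt(3.0)) - 1.0) / 2.0
```

For every $r > 0$ the second bound is the smaller of the two, so a distance $d$ below it also satisfies the degree-2 condition of Fig. 12(a).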

Thus, it is possible to check for a valid point matching of the remaining points
in $P_R$ by computing a maximum matching in a bipartite graph as follows.
BuildGraph builds a bipartite graph on the feasible regions and the sets $D$
of the partition of the discs $p(d)$ of the points in $P_R$. For each cell $D$
of the partition there are two edges in the graph, connecting $D$ with the two
feasible regions that the discs in $D$ intersect. We now compute a maximum
matching on that graph. If this matching connects all $D$-vertices with a
feasible region, the predicate returns true and the bound for the Hausdorff
distance is at most the tested value $d$. Otherwise the predicate returns false.
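The matching test can be sketched as follows. For brevity the sketch uses the simple augmenting-path method (often called Kuhn's algorithm) instead of the algorithm of Hopcroft and Karp, so it does not achieve the running time claimed in this paper. The input encoding, where each cell $D$ is given by the indices of its two feasible regions, and the function name are assumptions.

```python
def all_cells_matched(cell_regions, num_regions):
    """cell_regions[k] = (i, j): the two feasible regions that the discs
    of cell D_k intersect. Returns True iff a matching saturates all cells."""
    owner_of_region = [None] * num_regions  # region index -> matched cell

    def try_assign(k, visited):
        # Look for an augmenting path that starts at cell k.
        for region in cell_regions[k]:
            if region in visited:
                continue
            visited.add(region)
            owner = owner_of_region[region]
            if owner is None or try_assign(owner, visited):
                owner_of_region[region] = k
                return True
        return False

    # If any cell cannot be augmented, no matching covers every cell.
    return all(try_assign(k, set()) for k in range(len(cell_regions)))
```

Since every cell has exactly two incident edges, the graph has at most twice as many edges as cells, which keeps even this simple method cheap on small inputs.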

C.2 Running time 

The algorithm consists of three phases. It first computes a polynomial-size set
of candidates, which takes $O(m^3 + m^2 n)$ time. On this set we perform a
binary search using the predicate. The computation of the predicate is done by
some recursive calls of the two sub-algorithms Remove-degree-1-discs and
Remove-degree-2-discs. These need to know the intersections of the discs $p(d)$
with the feasible regions, which are the discs in $\tilde{Q}$ in the beginning.
We store these intersections with every point $p \in P$, and store references
with each feasible region to the discs $p(d)$ it intersects. The initial set of
intersections can be computed using a sweep line in $O((m + n) \log(m + n))$
time. The restrictions of the feasible regions can take at most $O(m)$ time.
Further, we maintain one point set containing the points $p \in P$ with degree 1
and a second point set for the points of degree 2. We move a point from the
second to the first point set if its degree is decreased. Thus, having the
initial intersection set, all calls of Remove-degree-1-discs without line 10
take $O(m)$ time.

The sub-algorithm Remove-degree-2-discs needs to iterate over all pairs of
feasible regions. Instead of considering all possible pairs, we only maintain a
set of region pairs which indeed intersect some discs $p(d)$. Because the sets
$D$ partition the points in $P$, there are at most $m$ discs to consider in the
stabbing analysis from line 1 to 6; thus Remove-degree-2-discs needs $O(m)$ time
per call. Since it is called at most $m$ times by Remove-degree-1-discs, its
overall running time is $O(m^2)$.

Finally we need to compute a maximum matching in the bipartite graph. Using
the algorithm of Hopcroft and Karp [7], this needs $O(m\sqrt{m + n^2})$ time.
Putting everything together, we get a running time of
$O(m^3 + m^2 n + ((m + n) \log(m + n) + m^2) \log(m^3 + m^2 n) + m\sqrt{m + n^2})$,
which can be simplified to $O(m^3 + m^2 n + n \log^2 n)$.
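The binary search of the second phase might look as follows. The sketch assumes that the predicate is monotone (once it holds for a candidate, it holds for all larger ones); `smallest_feasible` and `predicate` are hypothetical names standing in for the routines described above.

```python
def smallest_feasible(candidates, predicate):
    """Return the smallest candidate d with predicate(d) true, or None.
    Assumes predicate is monotone: false below some threshold, true above."""
    values = sorted(set(candidates))
    if not values or not predicate(values[-1]):
        return None
    lo, hi = 0, len(values) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if predicate(values[mid]):
            hi = mid          # values[mid] works, try smaller candidates
        else:
            lo = mid + 1      # values[mid] fails, the answer is larger
    return values[lo]
```

With $O(m^3 + m^2 n)$ candidates, the predicate is evaluated only $O(\log(m^3 + m^2 n))$ times, which matches the logarithmic factor in the running time above.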
D Decision algorithm used in GrownDiscs

Let $d$ be any given positive value. We describe a decision algorithm with the
following guarantee: if there exists a solution to our problem with distance at
most $d$, it returns a solution with distance at most $(c+2)d$. If no solution
of distance at most $d$ exists, the algorithm either still returns a solution
with distance at most $(c+2)d$, or it returns false.

Let $\tilde{q}_1, \ldots, \tilde{q}_n$ be the discs in $\tilde{Q}$, where disc
$\tilde{q}_i$ has radius $r_i$. We define the grown disc $\tilde{q}'_i$ to be
the disc with the same centre point as $\tilde{q}_i$, but with radius $r_i + d$.
We call the resulting set $\tilde{Q}'$.

Observation 1 If $P$ is not covered by $\tilde{Q}'$, there exists no solution of
value $d$.

So, we assume $P$ is covered by $\tilde{Q}'$. We can test this easily, and
immediately return false if this is not the case. Now, we compute the
arrangement $A$ of $\tilde{Q}'$, which has quadratic complexity in the worst
case. Fig. 9(a) shows an example of the arrangement formed by the discs
$\tilde{Q}'$. If $I \subseteq \{1, \ldots, n\}$ is a certain set of indices,
denote by $A_I$ the cell of the arrangement in the intersection of all discs
$\{\tilde{q}'_i \mid i \in I\}$, but not inside any other disc. (Of course, most
of these cells do not exist, since there is only a quadratic number of cells.)
Each cell of this arrangement contains a subset of $P$; we define $P_I$ to be
the set of points of $P$ inside $A_I$. Fig. 9(b) shows an example.
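The coverage test of Observation 1 and the grouping of $P$ into the sets $P_I$ can be sketched without building the arrangement explicitly: each point is simply labelled with the set of grown discs that contain it. This is a brute-force $O(mn)$ illustration under assumed names and a `(centre, radius)` disc representation; indices are 0-based here.

```python
import math
from collections import defaultdict

def group_points_by_cell(P, discs, d):
    """Return a dict mapping each index set I (a frozenset) to the list P_I,
    or None if some point of P is not covered by any grown disc."""
    cells = defaultdict(list)
    for p in P:
        # Indices of the grown discs (radius r_i + d) that contain p.
        I = frozenset(i for i, (centre, r) in enumerate(discs)
                      if math.dist(p, centre) <= r + d)
        if not I:
            return None  # Observation 1: no solution of value d exists
        cells[I].append(p)
    return dict(cells)
```

Only index sets that actually contain a point of $P$ appear as keys, so at most $m$ of the cells are ever materialised.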

Now, assume that there exists a solution of value at most d. 

Observation 2 Let $I$ be a set of indices. In the optimal solution $Q$ of
$\tilde{Q}$, all the points of $P_I$ are covered by circles of radius $d$ around
the points in $\{q_i \mid i \in I\}$.

Proof. Since the optimal solution has Hausdorff distance $h(P, Q) \le d$, we
know that each point $p \in P$ is covered by some circle of radius $d$ around a
point $q \in Q$. Now assume that $p \in P_I$. Then we know that $|pq| \le d$ and
$q \in \tilde{q}$, therefore $p \in \tilde{q}'$. So, by definition of $A_I$,
$q$ must be $q_i$ for some $i \in I$.

□

This observation suggests that we can solve the problem separately in each cell
of $A$. For a given cell $A_I$, the optimal solution uses $k_I \le |I|$ circles
of radius $d$ (centred around points of $Q$) to cover the points in $P_I$. Now,
we could compute such a set of circles (most likely a different set) by applying
the $c$-approximation algorithm for geometric $k$-centre. This would provide us
with a set $C_I$ of at most $k_I$ circles of radius $cd$. However, there is a
problem with this approach. The solutions are not independent: it is possible
that a certain circle of the optimal solution covers points from two different
cells of the arrangement. This means we may have constructed more than $n$
circles.

So, what we do instead is this. We process the cells of the arrangement in any
order. For the first cell $A_I$, we compute a set $C_I$ of at most $k_I$ circles
of radius $cd$ that cover $P_I$. Now, we grow our circles until they have radius
$(c+2)d$. This ensures that any points of $P$ outside $A_I$ that were covered by
discs of the optimal solution that were covering any points of $P_I$ are now
also covered by $C_I$. Fig. 13 illustrates this.

Fig. 13. (a) A cell $A_I$ of the arrangement, and a set of two circles of radius
$d$ covering $P_I$ in the optimal solution. (b) A different set of at most two
circles of radius $cd$ covering $P_I$, as produced by the subroutine. (c) The
enlarged circles with radius $(c+2)d$ also cover all points outside $A_I$ that
could be covered by the circles of the optimal solution.

A second complication comes from the fact that we required the centres of the
circles to be in $\tilde{Q}$, not just in $\tilde{Q}'$. In order to ensure this,
we simply move the circle centres to the closest point in their discs, moving
them by at most $d$. Since the circles are enlarged by $2d$, the moved and
enlarged circles will still cover all points of $P$ that were covered by the
original circles. Fig. 14 shows this case. Furthermore, this case does not
interfere with the case described above, because a circle cannot at the same
time be close to the boundary of $A_I$ and far enough away from it not to cover
a point that is covered by a circle that also covers a point from a neighbouring
cell.
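Moving a circle centre to the closest point of its disc is a simple radial projection. The sketch below uses a hypothetical helper name; it relies on the fact that a centre inside the grown disc of radius $r_i + d$ is at distance at most $d$ from the original disc.

```python
import math

def snap_to_disc(x, centre, radius):
    """Return the point of the disc (centre, radius) closest to x."""
    dist = math.dist(x, centre)
    if dist <= radius:
        return x  # already inside the disc, nothing to move
    t = radius / dist  # shrink the vector centre -> x onto the boundary
    return (centre[0] + t * (x[0] - centre[0]),
            centre[1] + t * (x[1] - centre[1]))

# A centre inside the grown disc (r = 1, d = 0.5) moves by at most d.
r, d = 1.0, 0.5
x = (1.4, 0.0)
moved = snap_to_disc(x, (0.0, 0.0), r)
shift = math.dist(x, moved)
```

By the triangle inequality, a point within $cd$ of the old centre is within $cd + d$ of the new one, so the enlarged radius $(c+2)d$ still covers it.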

For each next cell, we only consider those points that have not been covered yet, 
and otherwise proceed in the same way. 

This procedure results in a set $C$ of at most $n$ circles, composed of a set
$C_I$ for each cell $A_I$ of the arrangement. This set has the property that
each $C_I$ contains no more circles than the corresponding set in the optimal
solution. This implies that there exists a matching between $C$ and $\tilde{Q}'$
in the graph that has an edge between circle $c$ and disc $\tilde{q}'_i$ if $c$
is in a set $C_I$ where $i \in I$. Clearly, this means that the centre of $c$ is
inside $\tilde{q}'_i$. Since an optimal matching exists, we can also compute one
efficiently (although it may be a different one).
For each value $d$, we spend $O(n^2)$ time to compute $A$, and
$T(k_I, |P_I|)$ time per cell to solve the geometric $k$-centre problem. If $k$
is the largest value of $k_I$ over all $I$, then a crude upper bound for this is
$n^2 T(k, m)$. As seen in the previous section, a matching can be computed in
$O(mn + m\sqrt{m})$ time [7]. So, we spend
$O(n^2 T(k, m) + mn + m\sqrt{m})$ time in total on the decision algorithm.

Fig. 14. (a) A circle of radius $cd$ covers a number of points of $P_I$ inside a
certain cell $A_I$ of the arrangement. The centre $q$ of the circle lies inside
$A_I$, but not inside the region $\tilde{q}$. (b) The point $q$ has been moved
into $\tilde{q}$, but now some points of $P_I$ that were covered are no longer
covered. (c) The enlarged circle of radius $(c+2)d$ covers the points again.



